Cross Linguistic Name Matching in English and Arabic

Authors

  • Andrew Freeman
  • Sherri L. Condon
  • Christopher Ackerman
Abstract

This paper presents a solution to the problem of matching personal names in English to the same names represented in Arabic script. Standard string comparison measures perform poorly on this task due to varying transliteration conventions in both languages and the fact that Arabic script does not usually represent short vowels. Significant improvement is achieved by augmenting the classic Levenshtein edit-distance algorithm with character equivalency classes.
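The idea in the abstract can be sketched as a Levenshtein distance whose substitution cost drops to zero when the Latin letter belongs to the equivalence class of the Arabic letter. This is a minimal sketch under stated assumptions, not the authors' implementation: the `EQUIV` table, the function names, and the unit costs are hypothetical choices made for illustration, and the table covers only four letters.

```python
# Illustrative equivalence classes (an assumption, not the paper's table):
# each Arabic letter maps to the Latin letters treated as a zero-cost match.
EQUIV = {
    "\u0643": {"k", "c", "q"},  # kaf
    "\u0631": {"r"},            # ra
    "\u064a": {"y", "i", "e"},  # ya
    "\u0645": {"m"},            # mim
}


def substitution_cost(latin_char, arabic_char, equiv):
    """Return 0 if the two characters are identical or equivalent, else 1."""
    if latin_char == arabic_char:
        return 0
    return 0 if latin_char in equiv.get(arabic_char, set()) else 1


def edit_distance(latin, arabic, equiv=EQUIV):
    """Levenshtein distance between a Latin-script and an Arabic-script name.

    Insertions and deletions cost 1; substitutions cost 0 inside an
    equivalence class and 1 otherwise.
    """
    latin = latin.lower()
    prev = list(range(len(arabic) + 1))  # distances from the empty prefix
    for i, lc in enumerate(latin, start=1):
        curr = [i]
        for j, ac in enumerate(arabic, start=1):
            curr.append(min(
                prev[j] + 1,                                     # delete lc
                curr[j - 1] + 1,                                 # insert ac
                prev[j - 1] + substitution_cost(lc, ac, equiv),  # substitute
            ))
        prev = curr
    return prev[-1]


karim_arabic = "\u0643\u0631\u064a\u0645"  # kaf, ra, ya, mim
print(edit_distance("Karim", karim_arabic))            # with equivalence classes
print(edit_distance("Karim", karim_arabic, equiv={}))  # plain Levenshtein
```

With the classes in place, the only remaining cost for this pair comes from the Latin "a", a short vowel that the Arabic spelling does not write. Passing an empty table reduces the function to the classic algorithm, under which no character of the two scripts matches.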

Similar resources

Cross-Linguistic Transfer Revisited: The Case of English and Persian

The present study sought to investigate the evidence for cross-linguistic transfer in a partial English immersion and non-immersion educational setting. To this end, a total of 145 first, third and fifth graders in a partial English immersion program and 95 students from the same grade levels in a non-immersion program were chosen. Six different English and Persian tests were administered: the ...

The Role of Ethnicity in Integrative Tests Performances of Male/ Female Iranian English Learners of Different Language Proficiency Levels

Linguistic/cultural differences of learners’ native language with English as a foreign language, gender and English proficiency level are among those numerous variables which affect English learning and its quality in Iranian context. The present study was an attempt to illuminate the effects of these variables on performing integrative approach of general English tests (cloze test and recall t...

Cross-Language Personal Name Mapping

Name matching between multiple natural languages is an important step in cross-enterprise integration applications and data mining. It is difficult to decide whether two syntactic values (names) from two heterogeneous data sources are alternative designations of the same semantic entity (person). This process becomes more difficult with the Arabic language due to several factors including spe...

Lexicalization vs. Vocalization: A Cross-Linguistic Study of Emphasis in English and Persian

Language is a system of verbal elements that makes communication of meanings possible in the manners the users intend by employing certain linguistic devices which are partly language-specific. Once communicating cross-linguistically, there is always a risk of negative transfer of techniques or processes from the first language (L1) to the foreign language (L2). The current study investigates the “e...

A Cross-linguistic and Cross-cultural Study of Epistemic Modality Markers in Linguistics Research Articles

Epistemic modality devices are believed to be one of the prominent characteristics of research articles as the commonly used genre among the academic community members. Considering the importance of such devices in producing and comprehending scientific discourse, this study aimed to cross-culturally and cross-linguistically investigate epistemic modality markers as an important subcategory...

Publication date: 2006